53 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
Bengali Bulgarian Catalan Czech Standard Arabic
Availability:
Freely Available
License:
CreativeCommons, Gnu
Size:
1 GByte
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Zhiwei Yu | Shanghai Jiaotong University | CN | | |
| Author 2 | David Mareček | Charles University in Prague | CZ | | |
| Author 3 | Zdeněk Žabokrtský | Charles University in Prague, Faculty of Mathematics and Physics | CZ | Charles University in Prague, Institute of Formal and Applied Linguistics | None |
| Author 4 | Daniel Zeman | Charles University in Prague, Faculty of Mathematics and Physics | CZ | | |
| Main Contact | Daniel Zeman | Charles University in Prague, Faculty of Mathematics and Physics | None | Charles University, Faculty of Mathematics and Physics | None |
Documentation:
http://ufal.mff.cuni.cz/hamledt
Written
Corpus,
Language Type:
Trilingual
Languages:
Bengali Hindi Telugu
Availability:
Freely Available
License:
CreativeCommons BY 4.0
Size:
85 MByte
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction
- Paper track: Resource paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Nurendra Choudhary | International Institute of Information Technology, Hyderabad | IN |
| Author 2 | Rajat Singh | International Institute of Information Technology, Hyderabad | IN |
| Author 3 | Vijjini Anvesh Rao | International Institute of Information Technology, Hyderabad | IN |
| Author 4 | Manish Shrivastava | International Institute of Information Technology, Hyderabad | IN |
| Main Contact | Rajat Singh | International Institute of Information Technology, Hyderabad | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Bengali
Availability:
From Data Center(s)
License:
Microsoft Research India License Agreement
Size:
7168 sentences
Production Status:
Existing-used
Use:
Corpus Creation/Annotation
- Paper title: Developing the Bangla RST Discourse Treebank
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Debopam Das | University of Potsdam | DE |
| Author 2 | Manfred Stede | University of Potsdam | DE |
| Main Contact | Debopam Das | University of Potsdam | None |
Documentation:
Bali, K., Choudhury, M., and Biswas, P. (2010). Indian language part-of-speech tagset: Bengali LDC2010T16.
Language Type:
Multilingual
Languages:
Bengali
Availability:
From Owner
License:
OpenSource
Size:
197.5 KByte
Production Status:
Newly created-in progress
Use:
Morphological Analysis
- Paper title: A Neural Lemmatizer for Bengali
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Abhisek Chakrabarty | Indian Statistical Institute, Kolkata | IN |
| Author 2 | Akshay Chaturvedi | Indian Statistical Institute | IN |
| Author 3 | Utpal Garain | Indian Statistical Institute | IN |
| Main Contact | Abhisek Chakrabarty | Indian Statistical Institute, Kolkata | None |
Documentation:
There is a README file which contains the documentation
Written
Lexicon,
Language Type:
Multilingual
Languages:
Bengali
Availability:
<Not Specified>
License:
Apache 2.0
Size:
Not yet known <Not Specified>
Production Status:
Newly created-in progress
Use:
Speech Synthesis
- Paper title: TTS for Low Resource Languages: A Bangla Synthesizer
- Paper track: Speech
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Alexander Gutkin | | GB |
| Author 2 | Linne Ha | <Not Specified> | None |
| Author 3 | Martin Jansche | Google Inc. | US |
| Author 4 | Knot Pipatsrisawat | | TH |
| Author 5 | Richard Sproat | | US |
| Main Contact | Richard Sproat | None | None |
Documentation:
<Not Specified>
Written
Grammar/Language Model,
Language Type:
Multilingual
Languages:
Bengali
Availability:
<Not Specified>
License:
Apache 2.0
Size:
4.5 MByte
Production Status:
Newly created-in progress
Use:
Speech Synthesis
- Paper title: TTS for Low Resource Languages: A Bangla Synthesizer
- Paper track: Speech
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Alexander Gutkin | | GB |
| Author 2 | Linne Ha | <Not Specified> | None |
| Author 3 | Martin Jansche | Google Inc. | US |
| Author 4 | Knot Pipatsrisawat | | TH |
| Author 5 | Richard Sproat | | US |
| Main Contact | Richard Sproat | None | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Bengali
Availability:
Not Applicable
License:
N/A
Size:
27 million words
Production Status:
Existing-used
Use:
Corpus Creation/Annotation
- Paper title: Developing the Bangla RST Discourse Treebank
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Debopam Das | University of Potsdam | DE |
| Author 2 | Manfred Stede | University of Potsdam | DE |
| Main Contact | Debopam Das | University of Potsdam | None |
Documentation:
Al Mumin, M. A., Shoeb, A. A. M., Selim, M. R., and Iqbal, M. Z. (2014). SUMono: A representative modern Bengali corpus. SUST Journal of Science and Technology, 21(1):78–86.
Language Type:
Multilingual
Languages:
Bengali
Availability:
<Not Specified>
License:
Apache 2.0
Size:
Not yet known <Not Specified>
Production Status:
Newly created-in progress
Use:
Speech Synthesis
- Paper title: TTS for Low Resource Languages: A Bangla Synthesizer
- Paper track: Speech
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Alexander Gutkin | | GB |
| Author 2 | Linne Ha | <Not Specified> | None |
| Author 3 | Martin Jansche | Google Inc. | US |
| Author 4 | Knot Pipatsrisawat | | TH |
| Author 5 | Richard Sproat | | US |
| Main Contact | Richard Sproat | None | None |
Documentation:
<Not Specified>
Speech
Corpus,
Language Type:
Multilingual
Languages:
Bengali
Availability:
From Data Center(s)
License:
IARPA Babel Bengali Agreement (Not-For-Profit), IARPA Babel Bengali Agreement (For-Profit), IARPA Babel Bengali Agreement (Non-Member)
Size:
215 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Developing the Bangla RST Discourse Treebank
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Debopam Das | University of Potsdam | DE |
| Author 2 | Manfred Stede | University of Potsdam | DE |
| Main Contact | Debopam Das | University of Potsdam | None |
Documentation:
Bills, A., David, A., Dubinski, E., Fiscus, J., Gillies, B., Harper, M., Jarrett, A., Molina, M., Ray, J., Rytting, A., Paget, S., Shen, W., Silber, R., Tzoukermann, E., and Wong, J. (2016). IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b.
Speech
Corpus,
Language Type:
Multilingual
Languages:
Bengali Gujarati Hindi Kannada Malayalam Odia Rajasthani Tamil Telugu
Availability:
License:
Creative Commons
Size:
None
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: Generic Indic Text-to-speech Synthesisers with Rapid Adaptation in an End-to-end Framework
- Paper track: 7.14 Cross-lingual and multilingual aspects in spe/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Anusha Prakash | indic TTS | None |
Documentation:
None




